Diluting the Scalability Boundaries: Exploring the Use of Disaggregated Architectures for High-Level Network Data Analysis
Traditional data centers are designed with a rigid architecture of
fit-for-purpose servers that provision resources beyond the average workload in
order to deal with occasional peaks of data. Heterogeneous data centers are
pushing towards more cost-efficient architectures with better resource
provisioning. In this paper we study the feasibility of using disaggregated
architectures for data-intensive applications, in contrast to the monolithic
approach of server-oriented architectures. In particular, we have tested a
proactive network analysis system in which the workload demands are highly
variable. In the context of the dReDBox disaggregated architecture, the results
show that the overhead caused by using remote memory resources is significant,
between 66% and 80%, but we have also observed that the memory usage is one
order of magnitude higher for the stress case with respect to average
workloads. Therefore, dimensioning memory for the worst case in conventional
systems will result in a notable waste of resources. Finally, we found that,
for the selected use case, parallelism is limited by memory. Therefore, using a
disaggregated architecture will allow for increased parallelism, which, at the
same time, will mitigate the overhead caused by remote memory.
Comment: 8 pages, 6 figures, 2 tables, 32 references. Pre-print. The paper
will be presented during the IEEE International Conference on High
Performance Computing and Communications in Bangkok, Thailand, 18-20
December 2017. To be published in the conference proceedings.
A Convolve-And-MErge Approach for Exact Computations on High-Performance Reconfigurable Computers
This work presents an approach for accelerating arbitrary-precision arithmetic on high-performance reconfigurable computers (HPRCs). Although faster and smaller, fixed-precision arithmetic has inherent rounding and overflow problems that can cause errors in scientific or engineering applications. This recurring phenomenon is usually referred to as numerical nonrobustness. Therefore, there is an increasing interest in the paradigm of exact computation, based on arbitrary-precision arithmetic. A number of libraries and languages support this paradigm, for example, the GNU multiprecision (GMP) library. However, the performance of computations is significantly reduced in comparison to that of fixed-precision arithmetic. In order to reduce this performance gap, this paper investigates the acceleration of arbitrary-precision arithmetic on HPRCs. A Convolve-And-MErge approach is proposed that implements virtual convolution schedules derived from the formal representation of the arbitrary-precision multiplication problem. Additionally, dynamic (nonlinear) pipeline techniques are exploited in order to achieve speedups ranging from 5x (addition) to 9x (multiplication), while keeping resource usage of the reconfigurable device low, ranging from 11% to 19%.
FSM Decomposition for Low Power in FPGA
In this paper, the realization of low-power finite state machines (FSMs) on FPGAs using decomposition techniques is addressed. The original FSM is divided into two submachines using a probabilistic criterion. Only one submachine is active at a time, while the other is disabled to save power. Differen
Implementation of an OBS access node supporting multiple services
Network operators want to deliver multiple services to their end customers. This leads to the need for efficient transport of data using optical network technologies. OBS is a promising technology for metro networks, where operators can locate their servers to provide services such as video, backup or PC virtualization. These services compete for shared network bandwidth when running in parallel. This work develops an FPGA-based access edge node that operates with multiple QoS applications. This paper describes the architecture of the design as well as the behaviour of the implemented solution. © 2012 IEEE